Skip to content

Do predictor=2 differencing at sample width, not byte width#1498

Merged
brendancol merged 2 commits intomainfrom
fix-predictor2-multibyte
May 6, 2026
Merged

Do predictor=2 differencing at sample width, not byte width#1498
brendancol merged 2 commits intomainfrom
fix-predictor2-multibyte

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • TIFF predictor=2 specifies differencing between adjacent same-component samples at the sample's natural bit width. The CPU and GPU code was doing it byte-wise, which drops the inter-byte carry for any sample wider than one byte.
  • uint8 was correct (one byte per sample). uint16, int16, uint32, int32 silently produced wrong pixel values.
  • Reading a uint16/int16/uint32/int32 predictor=2 TIFF written by libtiff, GDAL, rasterio, or tifffile gave garbage. Writing such a file from xrspatial produced output other libraries decoded as garbage. The internal xrspatial round-trip happened to work because both ends used the same broken arithmetic.
  • Fix: dispatch predictor_decode/predictor_encode on CPU and GPU to per-dtype kernels (uint8/16/32/64) that operate on a typed view of the byte buffer with stride = samples_per_pixel.
  • Caught by the deep-pass geotiff accuracy audit (follow-up to CPU fp_predictor_decode gives wrong pixels for multi-band predictor=3 TIFFs #1247). Existing tests only covered uint8 multi-sample for predictor=2.

Test plan

  • pytest xrspatial/geotiff/tests/test_predictor_multisample.py (33 passed including new GPU+CPU multi-byte cases)
  • pytest xrspatial/geotiff/tests/ (533 passed; pre-existing matplotlib recursion failures in test_features plotting tests are unrelated to this change and reproduce on main)
  • tifffile interop: write a uint16 predictor=2 file with xrspatial, read with tifffile, and vice versa; both directions match
  • GPU CuPy backend exercised on host with CUDA available

TIFF predictor=2 differences adjacent same-component samples at the
sample's natural bit width. The CPU and GPU code was differencing
byte-wise instead, which drops the inter-byte carry for samples wider
than one byte. uint8 was fine because a sample fits in one byte.
uint16, int16, uint32, int32 silently came out wrong.

What broke:
- Reading a uint16/int16/uint32/int32 TIFF written with predictor=2 by
  libtiff, GDAL, rasterio, or tifffile gave garbage (off by tens of LSBs
  for typical data).
- Writing such a file from xrspatial produced output other libraries
  decoded as garbage. The xrspatial round-trip happened to work because
  both ends shared the same broken arithmetic.

Fix: rework predictor_decode/predictor_encode on CPU and GPU to view the
buffer as the matching unsigned dtype (uint8/16/32/64) in the file's
byte order and difference at that resolution with stride =
samples_per_pixel. The 8-bit byte path stays as a fast special case
since it was already correct.

Tests cover reads of tifffile-produced predictor=2 TIFFs across
uint16/int16/uint32/int32 and multi-band uint16, writer output verified
by tifffile reading it back, and GPU/CPU parity for the multi-byte
cases.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 6, 2026
@brendancol brendancol merged commit ad259d7 into main May 6, 2026
11 checks passed
@brendancol brendancol deleted the fix-predictor2-multibyte branch May 6, 2026 11:49
brendancol added a commit that referenced this pull request May 6, 2026
The third-pass agent worked from a stale local main and re-derived the
predictor=2 sample-wise fix that already shipped in #1498. That PR has been
closed as a duplicate. Only the predictor=3 big-endian fix in #1500 is a
genuine new finding.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant